Deep Learning Policy Quantization
Abstract
We introduce a novel actor-critic approach for deep reinforcement learning that is based on learning vector quantization. We replace the softmax operator of the policy with a more general and more flexible operator that is similar to the robust soft learning vector quantization algorithm. We compare our approach to the default A3C architecture on three Atari 2600 games and a simple game called Catch. The proposed algorithm outperforms the softmax architecture on Catch. On the Atari games, the results are mixed: no single model performs best across all games.
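To make the idea of a prototype-based policy operator concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function name `lvq_policy`, the `temperature` parameter, the use of squared Euclidean distance, and the toy prototypes are all assumptions made for illustration. It follows the general form of a robust soft LVQ class posterior, where each prototype carries an action label and the probability of an action is the normalized sum of the exponentiated negative distances of its prototypes.

```python
import math

def lvq_policy(features, prototypes, proto_actions, n_actions, temperature=1.0):
    """Return action probabilities from distances to labelled prototypes.

    Hypothetical helper: names and the distance measure are assumptions,
    not taken from the paper.
    """
    # Negative scaled squared Euclidean distance plays the role of a logit.
    scores = [
        -temperature * sum((f - w) ** 2 for f, w in zip(features, proto))
        for proto in prototypes
    ]
    # Subtract the largest score before exponentiating for numerical stability.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    # Accumulate the normalized weight of every prototype assigned to each action.
    probs = [0.0] * n_actions
    for weight, action in zip(weights, proto_actions):
        probs[action] += weight / total
    return probs

# Toy example: three prototypes in a 2-D feature space, two actions.
features = [0.0, 0.0]
prototypes = [[0.0, 0.0], [3.0, 4.0], [0.0, 1.0]]
proto_actions = [0, 1, 0]  # action label of each prototype
probs = lvq_policy(features, prototypes, proto_actions, n_actions=2)
print(probs)
```

With exactly one prototype per action, this sketch reduces to an ordinary softmax over the distance-based scores. Allowing several prototypes per action is one way such an operator can be more general than a plain softmax output layer.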
Similar resources
P-V-L Deep: A Big Data Analytics Solution for Now-casting in Monetary Policy
The development of new technologies has confronted the entire domain of science and industry with issues of big data's scalability as well as its integration with the purpose of forecasting analytics in its life cycle. In predictive analytics, the forecast of near-future and recent past - or in other words, the now-casting - is the continuous study of real-time events and constantly updated whe...
Deep Reinforcement Learning of Video Games
The ability to learn is arguably the most crucial aspect of human intelligence. In reinforcement learning, we attempt to formalize a certain type of learning that is based on rewards and penalties. These supervisory signals should guide an agent to learn optimal behavior. In particular, this research focuses on deep reinforcement learning, where the agent should learn to play video games solely...
Accurate Deep Representation Quantization with Gradient Snapping Layer for Similarity Search
Recent advance of large scale similarity search involves using deeply learned representations to improve the search accuracy and use vector quantization methods to increase the search speed. However, how to learn deep representations that strongly preserve similarities between data pairs and can be accurately quantized via vector quantization remains a challenging task. Existing methods simply ...
Competitive Reinforcement Learning in Continuous Control Tasks
This paper describes a novel hybrid reinforcement learning algorithm, Sarsa Learning Vector Quantization (SLVQ), that leaves the reinforcement part intact but employs a more effective representation of the policy function using a piecewise constant function based upon “policy prototypes.” The prototypes correspond to the pattern classes induced by the Voronoi tessellation generated by self-orga...
Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding
Deep Compression is a three stage compression pipeline: pruning, quantization and Huffman coding. Pruning reduces the number of weights by 10x, quantization further improves the compression rate between 27x and 31x. Huffman coding gives more compression: between 35x and 49x. The compression rate already included the metadata for sparse representation. Deep Compression doesn’t incur loss of accu...